
    FAIRness and Usability for Open-access Omics Data Systems

    Omics data sharing is crucial to the biological research community, and the last decade or two has seen a huge rise in collaborative analysis systems, databases, and knowledge bases for omics and other systems biology data. We assessed the FAIRness of NASA's GeneLab Data Systems (GLDS) along with four similar kinds of systems in the research omics data domain, using 14 FAIRness metrics. The range of overall FAIRness scores was 6-12 (out of 14), with an average of 10.1 and a standard deviation of 2.4. The range of Pass ratings for the metrics was 29-79%, Partial Pass 0-21%, and Fail 7-50%. The systems we evaluated performed best in the areas of data findability and accessibility, and worst in the area of data interoperability. Reusability of metadata, in particular, was frequently not well supported. We relate our experiences implementing semantic integration of omics data from some of the assessed systems for federated querying and retrieval functions, given their shortcomings in data interoperability. Finally, we propose two new principles that Big Data system developers, in particular, should consider for maximizing data accessibility
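
    The rating percentages above are stated relative to 14 metrics, so they can be read back as approximate metric counts. The following is a minimal sketch of that arithmetic, assuming each reported percentage is a whole number of metrics out of 14, rounded to the nearest percent; the helper name `as_count` is illustrative and not taken from the paper.

```python
# Convert the reported rating percentages back into metric counts.
# Assumption: each percentage is (count / 14), rounded to a whole percent.
N_METRICS = 14


def as_count(percent: float) -> int:
    """Return the number of metrics (out of 14) closest to the given percentage."""
    return round(percent / 100 * N_METRICS)


# Reported ranges across the assessed systems, as (low %, high %).
reported = {
    "Pass": (29, 79),
    "Partial Pass": (0, 21),
    "Fail": (7, 50),
}

counts = {
    rating: (as_count(low), as_count(high))
    for rating, (low, high) in reported.items()
}

for rating, (low, high) in counts.items():
    print(f"{rating}: {low}-{high} of {N_METRICS} metrics")
```

    Under that assumption, the best-performing system passed 11 of the 14 metrics and the weakest passed 4.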

    NASA GeneLab Space Omics Database: Expanding from Space to Ionizing Radiation Data on the Ground

    NASA GeneLab is an open-access repository for omics datasets generated by biological experiments conducted in space or ground experiments relevant to spaceflight (e.g. simulated cosmic radiation, simulated microgravity, bed rest studies). The GeneLab Data Systems (GLDS) version 4.0 will be available on October 1st, 2019, and will provide a state-of-the-art bioinformatics platform for the space biology and radiation communities to upload their data into an omics data commons, to process their data with vetted standard workflows, and to compare with existing analyses. Started in 2015 as a repository designed to archive omics data from space experiments, GeneLab has expanded its scope to all ionizing radiation omics experiments conducted on the ground and has put considerable effort into providing carefully characterized radiation metadata on all datasets. GeneLab is also providing processed data derived from the raw data covering a large spectrum of omics (genome, epigenome, transcriptome, epitranscriptome, proteome, metabolome) to help users explore important questions: 1) Which genes or proteins are expressed differently in space for various living organisms? 2) What specific DNA mutations or epigenetic changes happen in space or after exposure to ionizing radiation? and 3) How does genetics affect these responses? Processed data available on GeneLab are derived by standard data analysis workflows vetted by hundreds of scientists who volunteered to join one of the four GeneLab Analysis Working Groups (Animal AWG, Plant AWG, Microbe AWG, Multi-Omics AWG). In this presentation, we will discuss how to bridge the gap between irradiation studies performed on Earth and biological experiments conducted in space since the early 1990s. We will discuss how radiation dosimetry was estimated for datasets derived from samples collected during the Space Shuttle era, on the International Space Station, and on other orbiting platforms. 
Finally, we will address future strategies regarding dose monitoring in future missions into space, inter-agency efforts to unify data under one umbrella, and knowledge dissemination across the radiation research community and the space biology community

    GeneLab Analysis Working Group Kick-Off Meeting

    Goals to achieve for GeneLab AWG - GL vision - Review of GeneLab AWG charter. Timeline and milestones for 2018. Logistics - Monthly Meeting - Workshop - Internship - ASGSR. Introduction of team leads and goals of each group. Introduction of all members. Q/A. Three-tier Client Strategy to Democratize Data. Physiological changes, pathway enrichment, differential expression, normalization, processing metadata, reproducibility. Data federation/integration with heterogeneous external bioinformatics databases. The GLDS currently serves over 100 omics investigations to the biomedical community via open access. In order to expand the scope of metadata record searches via the GLDS, we designed a metadata warehouse that collects and updates metadata records from external systems housing similar data. To demonstrate the capabilities of federated search and retrieval of these data, we imported metadata records from three open-access data systems into the GLDS metadata warehouse: NCBI's Gene Expression Omnibus (GEO), EBI's PRoteomics IDEntifications (PRIDE) repository, and the Metagenomics Analysis server (MG-RAST). Each of these systems defines metadata for omics data sets differently. One solution to bridge such differences is to employ a common object model (COM) to which each system's representation of metadata can be mapped. Warehoused metadata records are then transformed at ETL to this single, common representation. Queries generated via the GLDS are then executed against the warehouse, and matching records are shown in the COM representation (Fig. 1). While this approach is relatively straightforward to implement, the volume of the data in the omics domain presents challenges in dealing with latency and currency of records. Furthermore, the lack of a coordinated [...]. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. 
Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics
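
    The common object model approach described in this abstract can be illustrated with a small sketch. It is not the GLDS implementation: the `CommonRecord` fields, the per-source field names (`gse`, `projectTitle`, `organism_name`, and so on), and the sample accessions are all hypothetical, chosen only to show source-specific records being mapped to one representation at load time and then queried in that form.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common object model (COM) for one omics metadata record."""
    source: str
    accession: str
    title: str
    organism: str


# One mapper per external system. The input field names are illustrative
# assumptions, not the real export schemas of GEO, PRIDE, or MG-RAST.
def from_geo(raw: dict) -> CommonRecord:
    return CommonRecord("GEO", raw["gse"], raw["title"], raw["organism"])


def from_pride(raw: dict) -> CommonRecord:
    return CommonRecord("PRIDE", raw["accession"], raw["projectTitle"], raw["species"])


def from_mgrast(raw: dict) -> CommonRecord:
    return CommonRecord("MG-RAST", raw["id"], raw["name"], raw["organism_name"])


MAPPERS: Dict[str, Callable[[dict], CommonRecord]] = {
    "GEO": from_geo,
    "PRIDE": from_pride,
    "MG-RAST": from_mgrast,
}


def load_warehouse(batches: Dict[str, List[dict]]) -> List[CommonRecord]:
    """Transform step of ETL: map every source record into the COM."""
    return [MAPPERS[source](raw) for source, rows in batches.items() for raw in rows]


def search(warehouse: List[CommonRecord], term: str) -> List[CommonRecord]:
    """Case-insensitive match on title or organism, run against the COM only."""
    term = term.lower()
    return [r for r in warehouse if term in r.title.lower() or term in r.organism.lower()]


# Made-up sample records, one per source system.
batches = {
    "GEO": [{"gse": "GSE-EXAMPLE", "title": "Mouse liver transcriptome", "organism": "Mus musculus"}],
    "PRIDE": [{"accession": "PXD-EXAMPLE", "projectTitle": "Mouse muscle proteome", "species": "Mus musculus"}],
    "MG-RAST": [{"id": "mgm-EXAMPLE", "name": "Soil metagenome", "organism_name": "unclassified"}],
}

warehouse = load_warehouse(batches)
hits = search(warehouse, "mouse")
print([h.accession for h in hits])
```

    Because the query only ever touches `CommonRecord`, adding another source in this sketch means writing one more mapper rather than changing the search code. The latency and currency issues the abstract mentions would arise in how often `load_warehouse` is rerun, which the sketch does not model.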

    DNA Damage Response to Low and High-LET in a Large Cohort of Mice and Humans and Latest Advancement in NASA Space Omics

    This presentation will first focus on a thorough evaluation of the DNA damage response to both low- and high-LET radiation in primary skin fibroblasts from a cohort of 76 mice of 15 different strains or in human blood mononuclear cells derived from 550 healthy donors. In both the human and mouse work, we have hypothesized that DNA repair capacity can be used as a marker to evaluate and differentiate individual radiation sensitivity. More specifically, this work is based on the concept that the combined time-dose dependence of radiation-induced foci (RIF) of p53-binding protein 1 (53BP1) following low-LET exposure contains sufficient information to infer sensitivity to any other LET. This work is one of the most extensive studies on the kinetics and possible genetic underpinnings of radiation-induced DNA damage and repair. Results on humans are still preliminary as we are still in the process of collecting and isolating primary blood mononuclear cells from 500 to 800 healthy subjects of European descent, 18-75 years of age, 50/50 male/female distribution. We have analyzed 53BP1+ RIF formation as well as oxidative stress and cell death in primary cells from 192 subjects in response to the same HZE particles as used in mice: 600 MeV/n Fe, 350 MeV/n Ar and 350 MeV/n Si, 1.1 and 3 particles/100 µm², 4 and 24 hours after irradiation. The second part of the talk will focus on describing GeneLab: The NASA Systems Biology Platform for Space Omics Repository, Analysis and Visualization. NASA GeneLab is an open-access repository for omics datasets generated by biological experiments conducted in space or experiments relevant to spaceflight (e.g. simulated cosmic radiation, simulated microgravity, bed rest studies). Started as a repository designed to archive precious omics data from space experiments, GeneLab has expanded its scope to maximize the intelligibility of the raw data (e.g. RNAseq, microarray, WGBS, metagenome), particularly for users with limited bioinformatics knowledge. 
As such, GeneLab is now providing processed data derived from the raw data covering a large spectrum of omics (genome, epigenome, transcriptome, epitranscriptome, proteome, metabolome), to help users explore important questions: Which genes or proteins are expressed differently in space for various living organisms? What are the consequences arising from these changes? What specific DNA mutations or epigenetic changes happen in space? What species or genetic features lead to better adaptation to such a unique environment? In this presentation, we will report on the current and future objectives for GeneLab, and review recently published studies relating molecular changes observed in various animal models and tissues with microgravity, radiation, circadian rhythm, hydration and carbon dioxide conditions

    Predicting Cancer Risk from Ionizing Radiation

    The ability to predict cancer risk associated with exposure to low doses of high-LET ionizing radiation (IR) remains a challenge. Epidemiological methods lack the sensitivity and power to provide detailed risk estimates for cancer and ignore individual variance in IR sensitivity. We have hypothesized that DNA repair capacity can be used as a marker to evaluate and differentiate individual radiation sensitivity. More specifically, this work is based on the concept that the combined time-dose dependence of radiation-induced foci (RIF) of p53-binding protein 1 (53BP1) following low-LET exposure contains sufficient information to infer sensitivity to any other LET. Our hypothesis was tested in 15 different mouse strains as well as in primary human immune cells. We first approached individual ionizing radiation sensitivity in a mouse model by culturing primary skin fibroblasts extracted from 76 mice of 15 different genetic backgrounds and exposing them to HZE particles and X-rays. This work is one of the most extensive studies on the kinetics and possible genetic underpinnings of radiation-induced DNA damage and repair. Our results are in agreement with a DNA repair model we previously postulated, where nearby DNA double-strand breaks (DSB) in the nucleus are brought together for more efficient repair, leading to RIF clustering. Such a mechanism was evidenced by a specific dose and LET dependence of RIF numbers. Briefly, RIF quantification after low-LET X-ray exposure showed an asymptotic saturation for doses between 1 Gy and 4 Gy 4 hours post-irradiation across all 15 strains. The clustering of DSB across all strains also led to more RIF/Gy for lower LET (X-ray and 350 MeV/n Ar) than for higher LET (600 MeV/n Fe) 4 hours post-exposure. Considering the fact that the number of DSB/Gy should be independent of LET, our data suggest there are more DSB in individual RIF as the LET increases. 
RIF numbers for 24 and 48 hours post-exposure showed the inverse trend, with more remaining RIF/Gy for higher LET (600 MeV/n Fe). This result suggests cells have more difficulty resolving RIF from higher LET as the number of DSB/RIF increases. Note that for most conditions, the variance of RIF/Gy was small within individual animals of the same strain and large between strains, suggesting a strong genetic component. Furthermore, we present our preliminary data from an ongoing study on human genetic associations with IR sensitivity. To address the human variability in responses to HZE particle irradiation in a maximally comprehensive manner, we are in the process of collecting and isolating primary blood mononuclear cells from 768 healthy subjects of European descent, 18-75 years of age, 50/50 male/female distribution. We have analyzed 53BP1+ RIF formation as well as oxidative stress and cell death in primary cells from 192 subjects in response to the same HZE particles as used in mice: 600 MeV/n Fe, 350 MeV/n Ar and 350 MeV/n Si, 1.1 and 3 particles/100 µm², 4 and 24 hours after irradiation. We will next complete the quantification of HZE particle-induced DNA and cellular damage in the remaining subjects and compare it to their responses to low-LET irradiation. Finally, we will perform GWAS analysis to identify human genomic associations with IR sensitivity and potential targets for biomarker development

    GeneLab: A Systems Biology Platform for Omics Analysis

    NASA's GeneLab includes an open-access repository of some 200+ omics datasets generated by biological experiments relevant to spaceflight (including simulated cosmic radiation and microgravity). In order to maximize the intelligibility of these data, particularly for users with limited bioinformatics knowledge, GeneLab is now transforming the data in the repository into actual biological and physiological knowledge of the genetic and proteomic signatures found in these samples. These processed data are being derived by establishing standard data analysis workflows vetted by 114 scientists who are members of the four GeneLab Analysis Working Groups (Animal AWG, Plant AWG, Microbe AWG, Multi-Omics AWG). AWG members from institutes spanning the U.S. and four other countries participate on a voluntary basis. The AWGs meet monthly to discuss data mining, compare results and interpretations, and test forthcoming releases of the GeneLab Data Systems (GLDS). GLDS version 3.0 has been available to the general public since October 1st, 2018, and has been providing a professional state-of-the-art bioinformatics platform for everyone in the space biology community to upload their data into a space biology omics data commons, to process their data with vetted standard workflows, and to compare to existing analyses. The user interface for the platform is being designed to be accessible to a broad variety of users, including those with limited bioinformatics experience, such as high school and college students who can use it to learn about omics data analysis and space biology. As such, GeneLab will constitute a powerful general public outreach capability of NASA and the Space Biology community at large. Data mining of the GeneLab database by the AWGs has already started generating very interesting findings, including reports linking specific spaceflight conditions such as radiation, microgravity or carbon dioxide levels to molecular changes seen across various species. 
In this presentation, we will report on the current and future objectives for GeneLab, and review recent studies reported by the various AWGs relating molecular changes observed in various animal models and tissue with microgravity, radiation, circadian rhythm, hydration and carbon dioxide conditions

    FAIRness and Usability for Open-Access Omics Data Systems

    Omics data sharing is especially crucial to the biological research community, and the last decade or two has seen a huge rise in collaborative analysis systems, databases, and knowledge bases for omics and other systems biology data. We assessed the "FAIRness" of NASA's GeneLab Data Systems (GLDS) along with four similar kinds of systems in the research omics data domain, using 14 FAIRness metrics. The range of Pass ratings was 29-79% of the 14 metrics, Partial Pass 0-21%, and Fail 7-50%. The range of overall FAIRness scores was 5-12 (out of 14). The systems we evaluated performed best in the areas of data findability and accessibility, and worst in the area of data interoperability. We propose two new principles that Big Data systems, in particular, should consider for increasing data accessibility. We relate our experiences implementing semantic integration of omics data from several systems for the federated querying and retrieval functions of the GLDS, given the shortcomings in data interoperability of these systems

    NASA's GeneLab Phase II: Federated Search and Data Discovery

    GeneLab is currently being developed by NASA to accelerate 'open science' biomedical research in support of the human exploration of space and the improvement of life on earth. Phase I of the four-phase GeneLab Data Systems (GLDS) project emphasized capabilities for submission, curation, search, and retrieval of genomics, transcriptomics and proteomics ('omics') data from biomedical research of space environments. The focus of development of the GLDS for Phase II has been federated data search for and retrieval of these kinds of data across other open-access systems, so that users are able to conduct biological meta-investigations using data from a variety of sources. Such meta-investigations are key to corroborating findings from many kinds of assays and translating them into systems biology knowledge and, eventually, therapeutics

    NASA's GeneLab: An Integrated Omics Data Commons and Workbench

    GeneLab (http://genelab.nasa.gov) is a NASA initiative designed to accelerate open science biomedical research in support of the human exploration of space and the improvement of life on earth. The GeneLab Data Systems (GLDS) were developed to help investigators corroborate findings from omics (genomics, transcriptomics, proteomics, and metabolomics) assays and translate them into systems biology knowledge and, eventually, therapeutics, including countermeasures to support life in space. Phase I of the project (completed) emphasized developing key capabilities for submission, curation, storage, search, and retrieval of omics data from biomedical research in and of space environments. The development focus for Phase II (completed) was federated data search and retrieval of these kinds of data from other open-access repositories. The last phase of the project (in work) entails developing an omics analysis tool set, and a portal to visualize processed omics data, emphasizing integration with the data repository and search functions developed during the prior phases. The final product will be an open-access system where users can individually or collaboratively publish, search, integrate, analyze, and visualize omics data